In [1]:
import numpy as np
from scipy import signal
from astropy.table import Table
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib.pylab as pylab
pylab.rcParams['figure.figsize'] = 20, 20
In this worksheet we'll use the data from Sodor et al on XY And. The data is available directly from the MNRAS page for the article. You want the zip file mnr_21837_sm_Tables56A1-A5.zip
and from that Table A1, which is in the file mnr_21837_sm_TableA1.dat
.
We'll start by reading that file in.
In [2]:
xy_and_v = Table.read('xy_and/mnr_21837_sm_TableA1.dat', format='ascii')
xy_and_date = xy_and_v['HJD-2400000']
xy_and_mag = xy_and_v['DeltaV']
Our goal here is to figure out what signal (i.e. meaningful frequencies) are present in the data from a variable star. The approach will be:
We'll calculate the periodogram using a method called the Lomb-Scargle algorithm. That isn't going to be explained here beyond the bare minimum needed to understand how to use it.
One of the inputs to Lomb-Scargle is the frequencies at which you want to calculate the power. The amount of power (i.e. height of the curve) indicates how "significant" that frequency is.
For RR Lyrae stars periods typically range from a few hours to a day, so we'll start with a fairly narrow range of frequencies.
NOTE: ALL FREQUENCIES ARE IN CYCLES PER DAY UNLESS OTHERWISE NOTED.
That causes one little wrinkle, because Lomb-Scargle wants angular frequency, not plain frequency. Turns out not to be hard to deal with, though.
In [3]:
f_max=1/0.1 # cycles/day, corresponding to a period of 0.1 days
f_min = 1./3 #cycles per day, corresponding a to a period of 3 days.
frequency = np.linspace(f_min, f_max, num=1000) # For now, pretend more points is better.
In [4]:
periodgram = signal.lombscargle(xy_and_date, xy_and_mag, 2*np.pi*frequency)
In [5]:
plt.plot(frequency, periodgram)
plt.xlabel('Frequency (cycle/day)')
plt.ylabel('Power')
plt.title('Initial periodogram of XY And')
Out[5]:
In this periodogram it looks like most of the power is between 2 c/d and 4 c/d, so let's redo the periodogram focusing just on that range. Why? Because the height and location of the peaks depends on how you choose the frequencies at which to calculate the power. Don't trust that you have found the correct peak until you have made a fairly high resolution periodogram.
In this case I know, from Sodor et al, that the primary frequency is about 2.5 c/d, so the highest peak above (at about 3.5 c/d) isn't actually the primary frequency.
In [6]:
f_max=2 # cycles/day
f_min = 4 #cycles per day
frequency = np.linspace(f_min, f_max, num=1000) # For now, pretend more points is better.
periodgram = signal.lombscargle(xy_and_date, xy_and_mag, 2*np.pi*frequency)
In [7]:
plt.plot(frequency, periodgram)
plt.xlabel('Frequency (cycle/day)')
plt.ylabel('Power')
plt.title('Initial periodogram of XY And')
Out[7]:
now the primary peak is clearly at just about 2.5c/d. There is a peak near 3.5 c/d--that peak is an alias of the true frequency caused by the fact that data is only taken at night, so there is an artificial (in the sense that it has nothing to do with the star XY And) frequency of 1c/d in the system.
Let's home in on that peak near 2.5 c/d...
In [8]:
f_max=2.4 # cycles/day
f_min =2.6 #cycles per day
frequency = np.linspace(f_min, f_max, num=1000) # For now, pretend more points is better.
periodgram = signal.lombscargle(xy_and_date, xy_and_mag, 2*np.pi*frequency)
In [9]:
plt.plot(frequency, periodgram)
plt.xlabel('Frequency (cycle/day)')
plt.ylabel('Power')
plt.title('Initial periodogram of XY And')
plt.xlim(2.5, 2.53)
Out[9]:
The middle peak is the one we want. How do we know? First, it is the tallest one. Second, the peaks to either side are additional aliases corresponding to a frequency of roughly .0027 c/d, which corresponds to a period of 1 year.
If I were ambitious today, I'd fit a polynomial to that central peak and take its derivative to find the maximum. Instead, I'll eyeball it: looks like about 2.507 c/d.
In [10]:
f0 = 2.507 # c/d
p0 = 1/f0
Time to look at the actual data we have, and at a couple ways of checking whether it really varies with a frequency of roughly 2.507 c/d.
In [11]:
plt.plot(xy_and_date, - xy_and_mag, '.')
plt.xlabel('Heliocentric Julian Date')
plt.ylabel('-$\Delta V$ (magnitude)')
Out[11]:
A phased light curve is one where the horizontal axis is fraction of a period rather than date. You can't calculate the phase without two things:
The epoch can be any date you want--it acts as the zero point from which you measure times. Typically it is chosen to be a date at whicht star is at maximum brightness, but we won't worry about that for now.
In [12]:
epoch = 2454381.5513 - 2400000
phase = (xy_and_date-epoch)/p0 - np.int64((xy_and_date-epoch)/p0)
plt.plot(phase, -xy_and_mag, '.', color='b')
plt.plot(phase+1, -xy_and_mag, '.', color='b') # traditionally the phased light curve is repeated so you can see the full structure
plt.xlabel('phase')
plt.ylabel(u'-$\Delta V$')
Out[12]:
In [13]:
f0 = 2.507996119
p0 = 1/f0
phase = (xy_and_date-epoch)/p0 - np.int64((xy_and_date-epoch)/p0)
plt.plot(phase, -xy_and_mag, '.', color='b')
plt.plot(phase+1, -xy_and_mag, '.', color='b') # traditionally the phased light curve is repeated so you can see the full structure
plt.xlabel('phase')
plt.ylabel(u'-$\Delta V$')
Out[13]:
We'll start with the simplest possible model, one with just one sine wave. The SinusoidModel
class imported below provides an easy way to do two things:
NOTE: The fit you get will change as you change the model. This isn't a once-and-you-are-done kind of thing.
In [14]:
from sinusoid import SinusoidModel
First, specify the model. This will actually make more sense when the model gets more complicated.
In [15]:
xy_and_simple = SinusoidModel()
xy_and_simple.frequencies = [f0] # just one frequency in this first model
xy_and_simple.modes = [[1]] # and just one component at 1*f0
Now, fit the model to the data...
In [16]:
xy_and_simple.fit_to_data(xy_and_date, xy_and_mag)
Let's see if this is even remotely plausible by overplotting the data and our first fit.
In [17]:
fit_magnitudes = xy_and_simple(xy_and_date)
plt.plot(phase, -xy_and_mag, '.', label='Data')
plt.plot(phase, -fit_magnitudes, '.', label='Fit')
plt.xlabel('Phase')
plt.ylabel('-$\Delta V$ (mag)')
Out[17]:
You can even print out the model if you want to see the fit parameters:
In [18]:
print xy_and_simple
This goes by the name pre-whitening because you are making your data closer to white noise. Pre-whitening is actually pretty easy: subtract your model from the data and this difference is the pre-whitened data.
In [19]:
residual = xy_and_mag - xy_and_simple(xy_and_date)
Next, make a periodogram of this residual...we'll start by focusing on the frequency range from 2 c/d to 4 c/d because the impact of pre-whitening is clearest there.
In [20]:
f_max=2 # cycles/day
f_min = 4 #cycles per day
frequency = np.linspace(f_min, f_max, num=1000) # For now, pretend more points is better.
periodgram = signal.lombscargle(xy_and_date, xy_and_mag, 2*np.pi*frequency)
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [21]:
plt.plot(frequency, periodgram, label='Before pre-whitening')
plt.plot(frequency, residual_periodogram, label='After removing single-component model', linewidth=3)
plt.xlabel('Frequency (c/d)')
plt.ylabel('Power')
plt.legend()
Out[21]:
In [22]:
f_max=1 # cycles/day
f_min = 10 #cycles per day
frequency = np.linspace(f_min, f_max, num=1000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [23]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
plt.ylabel('Power')
plt.title('Pre-whitened spectrum of XY And')
Out[23]:
In [24]:
f_max=4.5 # cycles/day
f_min = 7.5 #cycles per day
frequency = np.linspace(f_min, f_max, num=1000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual-residual.mean(), 2*np.pi*frequency)
In [25]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
plt.ylabel('Power')
plt.title('Pre-whitened spectrum of XY And')
Out[25]:
It turns out the most significant peak is just above 5 c/d. It is not coincidence that this is twice the fundamental frequency. In Fourier decompositions you typically have not just a single frequency but a "fundamental" frequency and multiples of that fundamental. We'll end up seeing power at not just the first few multiples, but at many.
In [26]:
xy_and_2modes = SinusoidModel()
xy_and_2modes.frequencies = [f0] # Only completely different frequencies get entered separately
xy_and_2modes.modes = [[1], #first mode is 1*f0
[2] #second mode is 2*f0
]
As before, we fit this model to the original data.
In [27]:
xy_and_2modes.fit_to_data(xy_and_date, xy_and_mag)
And, like before, make a phased light curve with both modes included.
In [28]:
fit_magnitudes = xy_and_2modes(xy_and_date)
plt.plot(phase, -xy_and_mag, '.', label='Data')
plt.plot(phase, -fit_magnitudes, '.', label='Fit with two modes')
plt.xlabel('Phase')
plt.ylabel('-$\Delta V$ (mag)')
Out[28]:
Note that the components of this new fits are different than the components of the second fit even for the modes they have in common:
In [29]:
print "First model:"
print xy_and_simple
print "\n\nSecond model:"
print xy_and_2modes
This is exactly the same as the process we went through with the first model...remove the model from the data and make a power spectrum of the leftovers.
In [30]:
residual = xy_and_mag - xy_and_2modes(xy_and_date)
I'll first make a periodogram over a fairly broad range so we can see how this next round of pre-whitening has altered the original spectrum.
In [31]:
f_max=2 # cycles/day
f_min = 15 #cycles per day
frequency = np.linspace(f_min, f_max, num=1000) # For now, pretend more points is better.
periodgram = signal.lombscargle(xy_and_date, xy_and_mag, 2*np.pi*frequency)
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [32]:
plt.plot(frequency, periodgram, label='Before pre-whitening')
plt.plot(frequency, residual_periodogram, label='After removing two-component model', linewidth=3)
plt.xlabel('Frequency (c/d)')
plt.ylabel('Power')
plt.legend()
Out[32]:
The first thing to look for will be power near additional harmonics of the fundamental (i.e $3f_0$, $4f_0$, etc.)
Let's graph the pre-whitened spectrum (that is, the spectrum of the residuals). The harmonics of the fundamental are marked along with the aliases at $\pm 1$c/d.
I've extended the range of frequencies covered by the periodogram to include many harmonics (since I know they are actually present in the spectrum).
In [33]:
f_max=5 # cycles/day
f_min = 26 #cycles per day
frequency = np.linspace(f_min, f_max, num=3000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
plt.plot(frequency, residual_periodogram, label='spectrum after removing two-component model')
for i in range(2,10):
harmonic_freq = i*np.array([f0,f0])
harmonic_alias = harmonic_freq + 1
plt.plot(harmonic_freq, plt.ylim(), ':', color='green', label='{}$f_0$'.format(i))
plt.plot(harmonic_alias, plt.ylim(), ':', color='red', label='daily alias of harmonic {}$f_0$'.format(i))
ymin, ymax = plt.ylim()
plt.text(harmonic_freq[0] + 0.1, 0.9*ymax, '{}$f_0$'.format(i), color='green')
plt.text(harmonic_alias[0] + 0.1, 0.8*ymax, 'daily alias\nof {}$f_0$'.format(i), color='red')
plt.xlabel('Frequency (c/d)')
plt.ylabel('Power')
plt.xlim(frequency.min(), frequency.max())
#legend(ncol=5)
Out[33]:
In [34]:
xy_and_9modes = SinusoidModel()
xy_and_9modes.frequencies = [f0] # Only completely different frequencies get entered separately
xy_and_9modes.modes = [[1], #first mode is 1*f0
[2], #second mode is 2*f0
[3], # and so on...
[4],
[5],
[6],
[7],
[8],
[9]
]
In [35]:
xy_and_9modes.fit_to_data(xy_and_date, xy_and_mag)
In [36]:
fit_magnitudes = xy_and_9modes(xy_and_date)
plt.plot(phase, -xy_and_mag, '.', label='Data')
plt.plot(phase, -fit_magnitudes, '.', label='Fit with 9 modes')
plt.xlabel('Phase')
plt.ylabel('-$\Delta V$ (mag)')
Out[36]:
In [37]:
residual = xy_and_mag - xy_and_9modes(xy_and_date)
f_max= 10*f0-1 # cycles/day
f_min = 20*f0+1 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [38]:
plt.plot(frequency, residual_periodogram)
for i in range(10,21):
harmonic_freq = i*np.array([f0,f0])
harmonic_alias = harmonic_freq + 1
plt.plot(harmonic_freq, plt.ylim(), ':', color='green', label='harmonic {}$f_0$'.format(i))
plt.plot(harmonic_alias, plt.ylim(), ':', color='red', label='daily alias of harmonic {}$f_0$'.format(i))
ymin, ymax = plt.ylim()
plt.text(harmonic_freq[0] + 0.1, 0.9*ymax, '{}$f_0$'.format(i), color='green')
plt.text(harmonic_alias[0] + 0.1, 0.8*ymax, 'daily alias\nof {}$f_0$'.format(i), color='red')
#legend()
In [39]:
residual = xy_and_mag - xy_and_9modes(xy_and_date)
f_max= 14*f0-0.5 # cycles/day
f_min = 14*f0+0.5 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [40]:
plt.plot(frequency, residual_periodogram)
for i in range(14,15):
harmonic_freq = i*np.array([f0,f0])
harmonic_alias = harmonic_freq + 1
plt.plot(harmonic_freq, plt.ylim(), ':', color='green', label='harmonic {}$f_0$'.format(i))
plt.plot(harmonic_alias, plt.ylim(), ':', color='red', label='daily alias of harmonic {}$f_0$'.format(i))
plt.legend()
plt.xlim(14*f0-0.1, 14*f0+0.1)
Out[40]:
Notice that there are peaks near $14f_0$ that are higher than the peak at $14f_0$.
This is a sign that there is an additional frequency present in the data, something also clear from the scatter of the data around the fit up to this point. The peaks near $14f_0$ are offset from $14f_0$ by an amount $f_m$, where $f_m$ is the modulation frequency.
In [41]:
residual = xy_and_mag - xy_and_9modes(xy_and_date)
f_max= f0-0.5 # cycles/day
f_min = f0+0.5 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [42]:
plt.plot(frequency, residual_periodogram)
for i in range(1,2):
harmonic_freq = i*np.array([f0,f0])
harmonic_alias = harmonic_freq + 1
plt.plot(harmonic_freq, plt.ylim(), ':', color='green', label='harmonic {}$f_0$'.format(i))
plt.plot(harmonic_alias, plt.ylim(), ':', color='red', label='daily alias of harmonic {}$f_0$'.format(i))
ymin, ymax = plt.ylim()
plt.text(harmonic_freq[0] + 0.1, 0.9*ymax, '{}$f_0$'.format(i), color='green')
plt.text(harmonic_alias[0] + 0.1, 0.8*ymax, 'daily alias\nof {}$f_0$'.format(i), color='red')
#legend()
plt.xlim(f0-0.1,f0+0.1)
Out[42]:
The tall peak offset to the right by about 0.02 c/d is the modulation frequency.
As when we found the main peak, we should really zoom in around this peak, fit a polynomial to it, and take the derivative of the fit to find the location of the peak. Instead, I'll pull the result from the paper: $f_m = 0.02417104$c/d
In [43]:
fm = 0.02417104 # c/d
You should expect power at the modulation frequency and at linear combinations of the fundamental frequency and the modulation frequency (e.g. $f_0+f_m$ or $3f_0 - 2f_m$, etc.).
To determine which are actually present you should look at the spectrum near each one of these expected peaks.
I'm going to skip much of that, and for the purposes of illustration I'm going to add in modes at $f_m$, $f_0+f_m$, $2f_0+f_m$, etc. up to $f_0+9f_m$ because the paper on which this is based includes all of those modes.
In [44]:
xy_and_modulated = SinusoidModel()
xy_and_modulated.frequencies = [f0, fm] # Now we have two independent frequencies
xy_and_modulated.modes = [[1, 0], #first mode is 1*f0 + 0*fm
[2, 0], #second mode is 2*f0 + 0*fm
[3, 0], # and so on...
[4, 0],
[5, 0],
[6, 0],
[7, 0],
[8, 0],
[9, 0],
[0, 1], # this is 0*f0 + f_m
[1, 1], # and 1*f0 + 1*fm
[2, 1], # and 2*f0+f_m
[3, 1], # and so on...
[4, 1],
[5, 1],
[6, 1],
[7, 1],
[8, 1],
[9, 1]
]
In [45]:
xy_and_modulated.fit_to_data(xy_and_date, xy_and_mag)
In [46]:
plt.plot(phase, xy_and_mag, '.', label='Data')
plt.plot(phase, xy_and_modulated(xy_and_date), '.', label='fit')
plt.legend()
Out[46]:
In [47]:
residual = xy_and_mag - xy_and_modulated(xy_and_date)
f_max= 1 # cycles/day
f_min = 10 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [48]:
plt.plot(frequency, residual_periodogram)
for i in range(15,15):
harmonic_freq = i*np.array([f0,f0])
harmonic_alias = harmonic_freq + 1
harmonic_mod = harmonic_freq + fm
plt.plot(harmonic_freq, plt.ylim(), ':', color='green', label='harmonic {}$f_0$'.format(i))
plt.plot(harmonic_mod, plt.ylim(), ':', color='black', label='{}$f_0+f_m$'.format(i))
plt.plot(harmonic_alias, plt.ylim(), ':', color='red', label='daily alias of harmonic {}$f_0$'.format(i))
#legend()
#xlim(14*f0-0.1, 14*f0+0.1)
There is clearly still more power near the fundamental, its harmonics, and their aliases.
But it is getting really tedious making different plots for each one, so let's get a little more efficient.
In [49]:
def narrow_periodgram(time, amplitude, center_frequency, frequency_width, num=1000):
frequencies = np.linspace(center_frequency-frequency_width/2, center_frequency+frequency_width/2, num=num)
pgram = signal.lombscargle(time, amplitude, 2*np.pi*frequencies)
return (frequencies, pgram)
While we are writing functions, how about one that adds vertical markers at a particular set of frequencies?
In [50]:
def mark_at(freqs, labels, ax=None):
if ax is None:
ax = plt.axis()
for freq, label in zip(freqs, labels):
ax.plot([freq, freq], plt.ylim(), ':')
ax.text(1.01*freq, 0.9*plt.ylim()[1], label)
In [51]:
freq_width = 2*3.5*fm
# set up markers at the modulation frequencies
mark_freq = []
mark_label = []
for i in [-3, -2, -1, 1, 2, 3]:
mark_freq.append(i*fm)
mark_label.append('{}$f_m$'.format(i))
max_harmonic = 20 # make a plot for harmonics 1 through this number
nplots_per_row = 5 # but only make this many plots per row
for harm in range(max_harmonic):
if (harm % nplots_per_row) == 0:
fig, axs = plt.subplots(ncols=nplots_per_row, sharey=True)
axis_index = harm % nplots_per_row
ax = axs[axis_index]
if axis_index == int(nplots_per_row/2):
ax.set_title('Normalized peridograms near harmonics')
cen_freq = (harm+1)*f0
f, p = narrow_periodgram(xy_and_date, residual, cen_freq, freq_width)
cen_freq_label = '{}$f_0$'.format(harm+1)
ax.plot(f - cen_freq, p/p.max(), label=cen_freq_label)
mark_at(mark_freq, mark_label, ax=ax)
ax.set_xlabel('$f-$'+cen_freq_label)
ax.legend()
plt.subplots_adjust(wspace=0)
OK, that's nice, but what modes have I already removed? Let's print our current model
In [52]:
print xy_and_modulated
From the graphs above, it looks like we should really include multiples of the fundamental through $13f_0$ (which is what the authors chose too), the mode at $+f_m$ through $13f_0$ and at $16f_0$, the mode at $-f_m$ through $9f_0$. Looks like there is some power at $\pm 2f_m$ also, but let's take out all of the $\pm f_m$ stuff first.
In [53]:
xy_and_all_fm = SinusoidModel()
xy_and_all_fm.frequencies = xy_and_modulated.frequencies
xy_and_all_fm.modes = [[1, 0], #first mode is 1*f0 + 0*fm
[2, 0], #second mode is 2*f0 + 0*fm
[3, 0], # and so on...
[4, 0],
[5, 0],
[6, 0],
[7, 0],
[8, 0],
[9, 0],
[10, 0],
[11, 0],
[12, 0],
[13, 0],
[0, 1], # this is 0*f0 + f_m
[1, 1], # and 1*f0 + 1*fm
[2, 1], # and 2*f0+f_m
[3, 1], # and so on...
[4, 1],
[5, 1],
[6, 1],
[7, 1],
[8, 1],
[9, 1],
[10, 1],
[11, 1],
[12, 1],
[13, 1],
[16, 1],
[1, -1],
[2, -1],
[3, -1],
[4, -1],
[5, -1],
[6, -1],
[7, -1],
[8, -1],
[9, -1]
]
In [54]:
xy_and_all_fm.fit_to_data(xy_and_date, xy_and_mag)
In [55]:
plt.plot(phase, - xy_and_mag, '.', label='data')
plt.plot(phase, -xy_and_all_fm(xy_and_date), '.', label='data')
plt.legend()
Out[55]:
In [56]:
residual = xy_and_mag - xy_and_all_fm(xy_and_date)
plt.plot(phase, residual, '.')
plt.title('Residual after removing lots of signal')
Out[56]:
In [57]:
f_max= 1 # cycles/day
f_min = 10 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [58]:
plt.plot(frequency, residual_periodogram)
Out[58]:
Notice how small the vertical scale is getting here. We are not left with a lot of signal still, but the graph of the residuals shows there is still some power, and the graphs above near the harmonics showed some power at $\pm 2f_m$, so let's remake those plots
In [59]:
freq_width = 2*3.5*fm
# set up markers at the modulation frequencies
mark_freq = []
mark_label = []
for i in [-3, -2, -1, 1, 2, 3]:
mark_freq.append(i*fm)
mark_label.append('{}$f_m$'.format(i))
max_harmonic = 20 # make a plot for harmonics 1 through this number
nplots_per_row = 5 # but only make this many plots per row
for harm in range(max_harmonic):
if (harm % nplots_per_row) == 0:
fig, axs = plt.subplots(ncols=nplots_per_row, sharey=True)
axis_index = harm % nplots_per_row
ax = axs[axis_index]
if axis_index == int(nplots_per_row/2):
ax.set_title('Normalized peridograms near harmonics')
cen_freq = (harm+1)*f0
f, p = narrow_periodgram(xy_and_date, residual, cen_freq, freq_width)
cen_freq_label = '{}$f_0$'.format(harm+1)
ax.plot(f - cen_freq, p/p.max(), label=cen_freq_label+' max is {:5f}'.format(p.max()))
mark_at(mark_freq, mark_label, ax=ax)
ax.set_xlabel('$f-$'+cen_freq_label)
ax.legend()
plt.subplots_adjust(wspace=0)
Let's now add modes $+2f_m$ up to $9f_0$ (after that there is still a peak, but its value is getting kind of small...not sure what the cutoff here should be), and modes at $-2f_m$ up to $4f_0$ and see where we stand.
In [60]:
xy_and_all_2fm = SinusoidModel()
xy_and_all_2fm.frequencies = xy_and_modulated.frequencies
xy_and_all_2fm.modes = [[1, 0], #first mode is 1*f0 + 0*fm
[2, 0], #second mode is 2*f0 + 0*fm
[3, 0], # and so on...
[4, 0],
[5, 0],
[6, 0],
[7, 0],
[8, 0],
[9, 0],
[10, 0],
[11, 0],
[12, 0],
[13, 0],
[0, 1], # this is 0*f0 + f_m
[1, 1], # and 1*f0 + 1*fm
[2, 1], # and 2*f0+f_m
[3, 1], # and so on...
[4, 1],
[5, 1],
[6, 1],
[7, 1],
[8, 1],
[9, 1],
[10, 1],
[11, 1],
[12, 1],
[13, 1],
[16, 1],
[1, -1],
[2, -1],
[3, -1],
[4, -1],
[5, -1],
[6, -1],
[7, -1],
[8, -1],
[9, -1],
[1, 2], # and 1*f0 + 1*fm
[2, 2], # and 2*f0+f_m
[3, 2], # and so on...
[4, 2],
[5, 2],
[6, 2],
[7, 2],
[8, 2],
[9, 2],
[1, -2],
[2, -2],
[3, -2],
[4, -2]
]
In [61]:
xy_and_all_2fm.fit_to_data(xy_and_date, xy_and_mag)
In [62]:
plt.plot(phase, - xy_and_mag, '.', label='data')
plt.plot(phase, -xy_and_all_2fm(xy_and_date), '.', label='data')
plt.legend()
Out[62]:
In [63]:
residual = xy_and_mag - xy_and_all_2fm(xy_and_date)
plt.plot(phase, residual, '.')
plt.title('Residual after removing lots of signal including $\pm 2f_m$ modes')
plt.xlabel('phase')
plt.ylabel('V magnitude')
Out[63]:
Not much signal left, but the data clearly spreads a bit towards the right/left edges, where the light curve is at maximum. Note well the vertical scale...we are now down to hundrenths of a magnitude!
In [64]:
f_max= 1 # cycles/day
f_min = 10 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [65]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
Out[65]:
Not too much left here, let's see what we see around the peaks...only going up to $15f_0$ his time because there doesn't seem to be much real above that.
In [66]:
freq_width = 2*3.5*fm
# set up markers at the modulation frequencies
mark_freq = []
mark_label = []
for i in [-3, -2, -1, 1, 2, 3]:
mark_freq.append(i*fm)
mark_label.append('{}$f_m$'.format(i))
max_harmonic = 15 # make a plot for harmonics 1 through this number
nplots_per_row = 5 # but only make this many plots per row
for harm in range(max_harmonic):
if (harm % nplots_per_row) == 0:
fig, axs = plt.subplots(ncols=nplots_per_row, sharey=True)
axis_index = harm % nplots_per_row
ax = axs[axis_index]
if axis_index == int(nplots_per_row/2):
ax.set_title('Normalized peridograms near harmonics')
cen_freq = (harm+1)*f0
f, p = narrow_periodgram(xy_and_date, residual, cen_freq, freq_width)
cen_freq_label = '{}$f_0$'.format(harm+1)
ax.plot(f - cen_freq, p/p.max(), label=cen_freq_label+' max is {:5f}'.format(p.max()))
mark_at(mark_freq, mark_label, ax=ax)
ax.set_xlabel('$f-$'+cen_freq_label)
ax.legend()
plt.subplots_adjust(wspace=0)
In [67]:
print xy_and_all_2fm
Though it isn't clear to me how to decide when to cut things off, the original paper includes these modes that we don't have yet: $14 f_0+f_m$, $10f_0 +2f_m$ through $13f_0 + 2f_m$, and $6f_0 -2f_m$.
Let's put those in and see what, if anything, is left.
In [68]:
xy_and_all_2fm_from_paper = SinusoidModel()
In [69]:
xy_and_all_2fm_from_paper.frequencies = xy_and_all_2fm.frequencies
old_modes = list(xy_and_all_2fm.modes) # IF YOU DON'T PUT IN THE LIST YOU MODIFY THE ORIGINAL MODES. BAD.
old_modes.extend([[14,1], [10,2], [11,2], [12,2], [13,2], [6, -2]])
xy_and_all_2fm_from_paper.modes = old_modes
In [70]:
#old_modes = xy_and_all_2fm.modes
xy_and_all_2fm.modes
Out[70]:
In [71]:
xy_and_all_2fm_from_paper.fit_to_data(xy_and_date, xy_and_mag)
In [72]:
residual = xy_and_mag - xy_and_all_2fm_from_paper(xy_and_date)
plt.plot(phase, residual, '.')
plt.title('Residual after removing lots of signal including $\pm 2f_m$ modes')
plt.xlabel('phase')
plt.ylabel('V magnitude')
Out[72]:
In [73]:
f_max= 1 # cycles/day
f_min = 10 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [74]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
Out[74]:
Still a few peaks in the range 5-8 c/d or so that aren't being removed, let's zoom in...
In [75]:
f_max= 8 # cycles/day
f_min = 5 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [76]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
Out[76]:
The highest peak is near 7.7, let's zoom in more...
In [77]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
plt.xlim(7.7,7.75)
Out[77]:
Highest peak is around 7.72 c/d; the paper calls this $f_2'=7.72021921$ c/d. Time to add this in, which means re-typing all of the modes. Grrr....wait, maybe not. For all of the previous modes I just need to add a zero to the end of the list. Lets try that.
In [78]:
f2 = 7.72021921 #c/d
xy_and_with_f2 = SinusoidModel()
xy_and_with_f2.frequencies = [f0, fm, f2]
old_modes = list(xy_and_all_2fm_from_paper.modes)
new_modes = []
for mode in old_modes:
mode_copy = list(mode)
mode_copy.append(0) # don't know why this is needed...maybe numpy append is overriding builtin append?
new_modes.append(mode_copy)
new_modes.append([0,0,1]) # add the new mode for f2'
xy_and_with_f2.modes = new_modes
Let's make sure this is entered right...
In [79]:
print xy_and_with_f2
In [80]:
xy_and_with_f2.fit_to_data(xy_and_date, xy_and_mag)
In [81]:
residual = xy_and_mag - xy_and_with_f2(xy_and_date)
plt.plot(phase, residual, '.')
plt.title("Residual after removing lots of signal including $\pm 2f_m$ modes and $f_2'$")
plt.xlabel('phase')
plt.ylabel('V magnitude')
Out[81]:
Maybe a little less scatter than before? Not real convinced to be honest...the paper claims to have fit peaks at $f2'-f0$, which is about 5.22 c/d, and $f2'+f_0$ (about 10.23 c/d) and $f_2'+2f_0$ (around 12.74 c/d).
Let's make spectra in those regions...
In [82]:
f_max= 6 # cycles/day
f_min = 5 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [83]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
Out[83]:
I don't see how the peak at 5.2 c/d is more significant than the one at about 5.92 c/d.
In [84]:
f_max= 13 # cycles/day
f_min = 10 #cycles per day
frequency = np.linspace(f_min, f_max, num=5000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, residual, 2*np.pi*frequency)
In [85]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
Out[85]:
Likewise, there are several peaks at about the same power as those claimed as real in the paper.
One last model, including all of the frequencies included in the final fit in the paper.
In [86]:
# If I were fine, upstanding scientist I'd do a real modification of the SinusoidModel to allow generating a new model from an old.
f2 = 7.72021921 #c/d
xy_and_with_all_in_paper = SinusoidModel()
xy_and_with_all_in_paper.frequencies = [f0, fm, f2]
old_modes = list(xy_and_with_f2.modes)
new_modes = []
for mode in old_modes:
mode_copy = list(mode)
new_modes.append(mode_copy)
new_modes.append([-1,0,1]) # -f0 + f2'
new_modes.append([1,0,1]) # f0 + f2'
new_modes.append([2,0,1]) # 2f0 + f2'
xy_and_with_all_in_paper.modes = new_modes
In [87]:
xy_and_with_all_in_paper.fit_to_data(xy_and_date, xy_and_mag)
In [88]:
final_residual = xy_and_mag - xy_and_with_all_in_paper(xy_and_date)
plt.plot(phase, -final_residual, '.')
plt.title('Residual from final model in paper')
plt.xlabel('Phase')
plt.ylabel('V magnitude')
Out[88]:
In any event, let's see if adding these extra modes cleaned up the odd peaks in the detail plots near the harmonics.
In [89]:
def plot_near_harmonics(time, amplitude, max_harmonic=20, nplots_per_row=5, freq_width=0.1):
for harm in range(max_harmonic):
if (harm % nplots_per_row) == 0:
fig, axs = plt.subplots(ncols=nplots_per_row, sharey=True)
axis_index = harm % nplots_per_row
ax = axs[axis_index]
if axis_index == int(nplots_per_row/2):
ax.set_title('Normalized peridograms near harmonics')
cen_freq = (harm+1)*f0
f, p = narrow_periodgram(xy_and_date, residual, cen_freq, freq_width)
cen_freq_label = '{}$f_0$'.format(harm+1)
ax.plot(f - cen_freq, p/p.max(), label=cen_freq_label+' max is {:5f}'.format(p.max()))
mark_at(mark_freq, mark_label, ax=ax)
ax.set_xlabel('$f-$'+cen_freq_label)
ax.legend()
plt.subplots_adjust(wspace=0)
In [90]:
freq_width = 2*3.5*fm
# set up markers at the modulation frequencies
mark_freq = []
mark_label = []
for i in [-3, -2, -1, 1, 2, 3]:
mark_freq.append(i*fm)
mark_label.append('{}$f_m$'.format(i))
plot_near_harmonics(xy_and_date, final_residual, freq_width=freq_width)
In [91]:
plot_near_harmonics(xy_and_date, final_residual-final_residual.mean(), freq_width=freq_width)
In [92]:
f_max= 0.1 # cycles/day
f_min = 10 #cycles per day
frequency = np.linspace(f_min, f_max, num=10000) # For now, pretend more points is better.
residual_periodogram = signal.lombscargle(xy_and_date, final_residual, 2*np.pi*frequency)
In [93]:
plt.plot(frequency, residual_periodogram)
plt.xlabel('Frequency (c/d)')
plt.ylabel('Power')
plt.title('Periodogram after prewhitening with full model from paper')
Out[93]:
In [94]:
print residual_periodogram.std()
In [95]:
print residual_periodogram.mean()
In [96]:
date_span = np.linspace(xy_and_date.min(), xy_and_date.max(), num=10000)
plt.plot(date_span[:2000], xy_and_with_all_in_paper(date_span[:2000]))
Out[96]:
In [96]: